1. Introduction

It is known that the properties of a protein depend largely on its three-dimensional
structure (tertiary structure). Accordingly, to obtain a protein with a desired property
artificially, it is necessary to synthesize a protein having the desired tertiary
structure. At present, however, no method has been proposed for accurately determining
what tertiary structure a protein with an arbitrary amino acid sequence (primary
structure) has. For this reason, the correlation between amino acid sequence and
structure is captured as a statistical quantity, based on data obtained by protein
structure analysis (analysis based on X-ray diffraction images, analysis based on
magnetic resonance, or analysis based on energy calculation) carried out on a limited
set of proteins of relatively low molecular weight, and the structure of a protein
having a new amino acid sequence is predicted from it. Within this approach, it has
been proposed to calculate not a direct correlation between the primary structure and
the tertiary structure but a correlation mediated by the local structure (secondary
structure) in the amino acid sequence, which serves as an intermediate structure of
the protein. Several methods for predicting the correlation between the primary
structure and the secondary structure have been devised. However, the amount of data
obtained by three-dimensional structure analyses to date is small, and the sizes and
fields of the proteins serving as samples are not uniform. The data are therefore
undeniably insufficient to serve as statistical quantities. As a result, these
prediction methods have different rules depending on how the data obtained by
structure analysis are interpreted [JMCF85], [SCHU79]. At present, experimental
verification is carried out based on the prediction results obtained from these rules,
or more accurate prediction methods are being studied.
In this report, we explain the configuration and functions of a support expert
system that can predict the secondary structure of a protein at a high rate using
several secondary structural prediction methods proposed to date, and that can
accommodate the development of new prediction methods.

2. Protein Structural Prediction

The prediction methods described in Chapter 1 are not complete, but they are quite
useful in that they can serve as guides for research [JMCF85].

Among the methods devised to date for predicting the secondary structure from the
primary structure, three are considered effective: the Chou & Fasman method, the
Robson method, and the Nagano method. Each of these methods relies on the empirical
observation, drawn from protein structure analysis results, that particular amino
acids appear frequently in specific secondary structures. Each method quantitatively
evaluates this frequency of occurrence, expresses the result as a value called a
"trend index", and predicts the secondary structure based on the trend index
[SCHU79], [Nagano85].
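
The general idea of predicting from a trend index can be sketched as follows. This
is a minimal illustration only, not the actual Chou & Fasman, Robson, or Nagano
procedure: the three-letter alphabet, the index values, the window size, and the
threshold are all hypothetical placeholders chosen for the example, and the function
names are our own.

```python
from typing import Dict

# Hypothetical trend indexes for a toy three-residue alphabet.
# These are placeholders, NOT published parameters of any method.
HELIX_INDEX: Dict[str, float] = {"A": 1.4, "G": 0.6, "V": 1.0}
SHEET_INDEX: Dict[str, float] = {"A": 0.8, "G": 0.7, "V": 1.7}


def window_mean(seq: str, table: Dict[str, float], center: int, half: int) -> float:
    """Average the trend index over a window centered on one residue."""
    lo = max(0, center - half)
    hi = min(len(seq), center + half + 1)
    values = [table[res] for res in seq[lo:hi]]
    return sum(values) / len(values)


def predict(seq: str, half: int = 1, threshold: float = 1.05) -> str:
    """Label each residue H (helix), E (sheet), or - (neither).

    A residue is labeled only when the larger of the two windowed
    averages reaches the (assumed) threshold.
    """
    labels = []
    for i in range(len(seq)):
        helix = window_mean(seq, HELIX_INDEX, i, half)
        sheet = window_mean(seq, SHEET_INDEX, i, half)
        if max(helix, sheet) < threshold:
            labels.append("-")
        elif helix >= sheet:
            labels.append("H")
        else:
            labels.append("E")
    return "".join(labels)


print(predict("AAAGGG"))
```

Real methods add nucleation and extension rules on top of such averages; the sketch
shows only how a per-residue statistic turns into a per-residue label.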
The following two problems must be considered. (1) The uncertainty of each
prediction method cannot be determined merely by applying the methods individually.
(2) Not only may the trend indexes change as three-dimensional structure analysis
results accumulate, but the prediction scheme itself may change with the discovery
of a new rule. These problems are characteristic of research support systems of this
type.

3. Protein Secondary Structural Prediction Support Expert System

In order to handle these problems, the following functions are required. (1) A
function of applying a plurality of prediction methods simultaneously and comparing
the uncertainty of each prediction method against that of the others. (2) A function
of coping with the fluidity of knowledge, i.e., the addition or change of knowledge,
or improvement in its precision.
We have, therefore, constructed a prototype of a protein secondary structural
prediction support expert system having these functions on the if 1000 AI
workstation. The present system collects a plurality of prediction methods and
implements their knowledge as independent modules in a knowledge base [Yamamoto86].
The present system presents results including the uncertainties of the respective
prediction methods, identifies the coincident parts and the contradictory parts of
the inference results, explains the logical ground for each inference result and
each inference process, and thereby supports the researcher's reasoning.
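
Identifying coincident and contradictory parts across several prediction results
can be pictured with a short sketch. It assumes, purely for illustration, that each
method returns one label per residue as a string of equal length; the method keys
and the labels H, E, and - are hypothetical and are not taken from the system
itself.

```python
from typing import Dict, List, Tuple

PositionReport = Tuple[int, str, Dict[str, str]]


def compare_predictions(results: Dict[str, str]) -> List[PositionReport]:
    """Mark each residue position as coincident or contradictory.

    `results` maps a method name to its per-residue label string.
    """
    lengths = {len(pred) for pred in results.values()}
    if len(lengths) != 1:
        raise ValueError("all predictions must cover the same sequence length")

    report: List[PositionReport] = []
    for pos in range(lengths.pop()):
        calls = {method: pred[pos] for method, pred in results.items()}
        agree = len(set(calls.values())) == 1
        report.append((pos, "coincident" if agree else "contradictory", calls))
    return report


# Hypothetical outputs of three methods for a four-residue sequence.
example = {"C": "HHE-", "R": "HHEE", "N": "HEE-"}
for pos, status, calls in compare_predictions(example):
    print(pos, status, calls)
```

Keeping the per-method labels in the report, rather than only the agreement flag,
leaves the user free to judge which method to trust at a contradictory position.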

3.1 System Configuration and Functions 
Fig. 1 depicts the logical configuration of the system.

Fig. 1 System logical configuration

The present system is composed of a knowledge base that stores the knowledge
necessary for secondary structural prediction, an inference engine that controls
inference using this knowledge base, an explanation mechanism that outputs the
inference processes, the inference results, and the logical grounds for the
respective inference results, a knowledge management mechanism that controls the
addition and correction of knowledge, and a man-machine interface unit.

The knowledge base is composed of two independent modules: (1) a structural
prediction rule unit that formulates a plurality of prediction methods into rules,
and (2) a statistical data unit that holds the statistical and probabilistic data
calculated from the structure analyses. At present, the structural prediction rule
unit (1) formulates the three prediction methods explained in Chapter 2 (the Chou &
Fasman method, the Robson method, and the Nagano method) into rules.
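
The benefit of keeping the rules and the statistical data in independent modules can
be illustrated with a small sketch. The class, the rule, and the numbers below are
hypothetical and do not reproduce the prototype's Prolog knowledge base; they only
show that the statistics can be revised without touching the rule that consumes
them.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict

Statistics = Dict[str, float]              # residue -> trend index (assumed layout)
Rule = Callable[[str, Statistics], float]  # a rule reads statistics, never owns them


@dataclass
class KnowledgeBase:
    """Two independent stores: prediction rules and statistical data."""
    rules: Dict[str, Rule] = field(default_factory=dict)
    statistics: Dict[str, Statistics] = field(default_factory=dict)

    def evaluate(self, method: str, segment: str) -> float:
        """Apply a method's rule to a segment using that method's statistics."""
        return self.rules[method](segment, self.statistics[method])


def mean_index(segment: str, stats: Statistics) -> float:
    """Toy rule: the average trend index over the segment."""
    return sum(stats[res] for res in segment) / len(segment)


kb = KnowledgeBase()
kb.rules["toy_method"] = mean_index
kb.statistics["toy_method"] = {"A": 1.0, "G": 0.5}  # placeholder values

before = kb.evaluate("toy_method", "AG")
kb.statistics["toy_method"]["A"] = 2.0              # new analysis data arrives
after = kb.evaluate("toy_method", "AG")
print(before, after)
```

The same separation works in the other direction: a new rule can be registered under
a new method name while the existing statistics stay as they are.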

The explanation mechanism has (1) an explanation function that explains the
inference processes and the logical grounds for the respective inference results,
and (2) a display function that displays each inference result as an image.

The knowledge management mechanism has functions for correcting the knowledge
already stored in the knowledge base, adding knowledge to the knowledge base, and
the like.

The man-machine interface unit controls a keyboard and a mouse for user input, and
a display for user output.

The if 1000 Prolog/SI interpreter is used as the inference engine.



3.2 System Output

Figs. 2 and 3 depict examples of the output of the present system. Fig. 2 depicts
the inference result obtained by the Chou & Fasman method. In Fig. 2, the secondary
structure regions (alpha-helix, beta-sheet, and bent structure) obtained as the
inference result are displayed schematically against the input target amino acid
sequence.

Fig. 3 depicts a comparison of the inference results based on the three prediction
methods explained in Chapter 2. In Fig. 3, the coincident parts and the
contradictory parts of the secondary structure regions obtained from the three
inference results for the input target amino acid sequence are made explicit.

Fig. 2 Display of inference prediction result by Chou & Fasman method (α-helix,
β-sheet, bent structure)

[Figure labels: amino acid sequence, secondary structure, trend index]

Fig. 3 Display of comparison of inference results (C: Chou & Fasman method, R:
Robson method, N: Nagano method)

4. Embodiment of Explanation Mechanism 

This chapter explains how the explanation functions required of the expert system
are realized.

4.1 Explanation of Inference Process

The structural prediction rules in the knowledge base that are used for the
inference are displayed in the order in which they fire. The user can verify the
correctness of the inference by viewing the displayed rules.

4.2 Explanation of Logical Ground for Inference Result

Hypotheses (including the initial hypothesis) that are built up during the
inference process are displayed sequentially until the inference result is obtained.
Details of the indexes (statistical and probabilistic quantities, including the
uncertainty) that change during the inference process are also displayed. This
display logically clarifies which hypotheses are generated in the inference process
of each prediction method, and which hypotheses are eliminated based on the indexes
assigned to them. As a result, the grounds on which the final inference result was
reached are made clear, and the differences in logic among the respective
prediction methods become clear. The user determines which prediction is most
probable while viewing these logical grounds.
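
A trace of this kind can be sketched as follows. The hypothesis names, the index
updates, and the cutoff are hypothetical, and the additive update is an assumption
made for brevity; the sketch only illustrates recording which hypotheses survive
each step and why the others are dropped.

```python
from typing import Dict, List, Optional, Tuple


def run_with_trace(
    initial: Dict[str, float],
    updates: List[Dict[str, float]],
    cutoff: float,
) -> Tuple[Optional[str], List[str]]:
    """Update hypothesis indexes step by step and log every elimination.

    `initial` maps each hypothesis to its starting index; each entry of
    `updates` adds a (possibly negative) amount to some hypotheses.
    """
    alive = dict(initial)
    trace = [f"initial: {sorted(alive.items())}"]
    for step, deltas in enumerate(updates, start=1):
        for name in list(alive):
            alive[name] += deltas.get(name, 0.0)
            if alive[name] < cutoff:
                trace.append(
                    f"step {step}: drop {name} "
                    f"(index {alive[name]:.2f} < {cutoff:.2f})"
                )
                del alive[name]
        trace.append(f"step {step}: alive {sorted(alive)}")
    best = max(alive, key=alive.get) if alive else None
    return best, trace


best, trace = run_with_trace(
    {"helix": 1.0, "sheet": 1.0, "turn": 1.0},     # placeholder starting indexes
    [{"helix": 0.2, "turn": -0.5}, {"sheet": -0.4}],
    cutoff=0.8,
)
print(best)
print("\n".join(trace))
```

Because the log stores the index value at the moment of elimination, a user reading
it can see not only the final answer but also how close each rejected hypothesis
came to surviving.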

5. Conclusion

The protein secondary structural prediction support expert system includes a
knowledge base that collects a plurality of prediction methods, and has the function
of explaining the logical ground for each prediction method. The present system
provides a criterion for predicting the structures of proteins whose structures are
unclear and of unknown proteins, and contributes to setting initial values for
detailed structure analysis. In the future, we will strengthen the functions of the
knowledge management mechanism, the explanation mechanism, and the like in the
prototype currently under development, and in particular expand the man-machine
interface, for example with a knowledge base editor.

Further, as for the direction of future system development, if various data become
sufficiently available as protein engineering develops, we expect that the present
system can be used as a support system (having the logical explanation function,
etc.) for devising new prediction methods, and that it will evolve into a more
accurate structural prediction expert system as the newly devised prediction
methods are implemented in it.